[00:03.720 --> 00:14.660]  Sorry about the no sound, people. I will figure this out. Every day,
[00:14.660 --> 00:25.960]  it just decides to grab the wrong audio input. So let's go back to the beginning and see if
[00:25.960 --> 00:31.920]  this works. Hi everyone. Today's talk is about machine learning privacy. The title of the talk
[00:31.920 --> 00:39.100]  is "Secrets Are Lies, Sharing Is Caring, Privacy Is Theft." My name is Nahid Farhadi and I'm a
[00:39.100 --> 00:45.520]  software developer at Capital One. I will be giving this talk with my colleague Vincent Pham.
[00:45.520 --> 00:51.160]  The outline of our talk is as follows. We will give an introduction on the importance of privacy
[00:51.160 --> 00:55.260]  and we will talk about privacy attacks, propose some defense techniques,
[00:55.260 --> 01:00.280]  and finally we will show a demo of an attack as well as defense techniques.
[01:01.920 --> 01:08.300]  Machine learning needs data. So what if this data has sensitive information like medical data
[01:08.300 --> 01:16.620]  or financial data? Even if we trust the algorithms that are generating the model or model developers,
[01:16.620 --> 01:22.080]  sometimes just using the statistics by looking at the output of a model, we can find out some
[01:22.080 --> 01:28.260]  information about the input of the model. This type of threat to ML can be black box or white
[01:28.260 --> 01:34.260]  box, meaning that the adversary doesn't necessarily need to have any access to the
[01:34.260 --> 01:41.760]  model specifics. So what is really privacy? Privacy means that if we are using some data
[01:41.760 --> 01:48.560]  from our users for training a model, then just by looking at the output of the model,
[01:48.560 --> 01:52.440]  no one should be able to get information on the input of the model.
[01:53.760 --> 02:00.480]  There is a trade-off between privacy and utility. If we have too much privacy,
[02:00.480 --> 02:05.920]  then it will be very difficult to use the model. If we don't have much privacy, then
[02:05.920 --> 02:14.020]  it will be useful, but then we will endanger our customers. PPML, privacy-preserving machine learning, is there to find
[02:14.020 --> 02:22.360]  the optimal point between utility and privacy. How can we sacrifice less utility while still getting enough privacy?
[02:23.460 --> 02:27.220]  My colleague will be talking about privacy attacks next.
[02:28.060 --> 02:32.920]  In terms of privacy attacks, there are two main categories. One is the classical kind, where the
[02:32.920 --> 02:37.280]  private data is in the training data, in the raw set. And then there are the ML-enabled ones, which
[02:37.280 --> 02:42.020]  I'll be focusing more on. In the next slide, you can see that there are several attack surfaces
[02:42.020 --> 02:48.980]  that attackers can utilize to perform a privacy attack. The first is the physical domain;
[02:48.980 --> 02:54.400]  in terms of network protection, that could be attack network traffic. Then there's the digital
[02:54.400 --> 03:00.140]  representation, which is the TCP dump. There's also the machine learning model, which is just
[03:00.140 --> 03:06.840]  the model itself, where you have an input and predict an output. In terms of this example,
[03:06.840 --> 03:10.120]  the output is the attack probability. And then you are back in the physical domain, which is shutting down the
[03:10.120 --> 03:15.360]  infrastructure. For this talk, we'll focus on the machine learning model. In the next slide,
[03:15.360 --> 03:21.460]  we will see our first type of attack, which is the linkage attack. With this attack, in the
[03:21.460 --> 03:26.780]  1990s, Latanya Sweeney found that she was able to re-identify the governor of
[03:26.780 --> 03:31.520]  Massachusetts by linking his health records to voter information using just three variables:
[03:31.520 --> 03:37.560]  his gender, date of birth, and ZIP code. In the next slide, she found that
[03:38.100 --> 03:43.360]  using these three variables alone, she was able to uniquely identify 87% of the population in the
[03:43.360 --> 03:51.600]  U.S. using the 1990 census. Working from the 2000 census, Philippe Golle found that using these three
[03:51.600 --> 03:58.740]  fields, he was able to identify only about 63% of the population. This could be a result of people
[03:58.740 --> 04:05.840]  urbanizing more, moving from the rural areas to the cities. But this makes an important point:
[04:05.840 --> 04:11.400]  using just very simple features, attackers can link records back to the original person.
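To make the linkage idea concrete, here is a minimal sketch in Python. It is not the method from the studies mentioned in the talk: the records, names, and the helper functions `quasi_key`, `link_records`, and `unique_fraction` are all made up for illustration. It joins a "de-identified" table to a public list on gender, date of birth, and ZIP code, and reports how many people that combination singles out.

```python
from collections import Counter

# Hypothetical toy data: every name, date, and diagnosis below is made up.
QUASI_IDENTIFIERS = ("gender", "dob", "zip")

health_records = [  # "anonymized": names removed, quasi-identifiers kept
    {"gender": "F", "dob": "1961-07-02", "zip": "02138", "diagnosis": "asthma"},
    {"gender": "F", "dob": "1980-03-09", "zip": "02139", "diagnosis": "diabetes"},
    {"gender": "M", "dob": "1975-11-20", "zip": "02140", "diagnosis": "hypertension"},
]

voter_roll = [  # public list that still carries names
    {"name": "Alice Example", "gender": "F", "dob": "1961-07-02", "zip": "02138"},
    {"name": "Bob Example", "gender": "M", "dob": "1975-11-20", "zip": "02140"},
    {"name": "Carol Example", "gender": "F", "dob": "1980-03-09", "zip": "02139"},
    {"name": "Dana Example", "gender": "F", "dob": "1980-03-09", "zip": "02139"},
]


def quasi_key(record):
    """Tuple of the three quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link_records(health, voters):
    """Return (name, diagnosis) pairs where exactly one voter matches."""
    names_by_key = {}
    for voter in voters:
        names_by_key.setdefault(quasi_key(voter), []).append(voter["name"])
    linked = []
    for row in health:
        names = names_by_key.get(quasi_key(row), [])
        if len(names) == 1:  # an unambiguous match re-identifies the row
            linked.append((names[0], row["diagnosis"]))
    return linked


def unique_fraction(records):
    """Fraction of records whose quasi-identifier combination is unique."""
    counts = Counter(quasi_key(r) for r in records)
    return sum(1 for r in records if counts[quasi_key(r)] == 1) / len(records)


linked = link_records(health_records, voter_roll)
print(linked)
print(unique_fraction(voter_roll))
```

In this toy data, the two people who share a birthday and ZIP code cannot be told apart, which is the effect the talk attributes to denser populations.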
[04:11.930 --> 04:18.520]  In the next slide, we look at another popular linkage attack. In 2006, Netflix released a
[04:18.520 --> 04:23.300]  public data set for a competition where they were offering a prize to improve their recommendations.
[04:23.300 --> 04:29.560]  And their FAQ stated that there is no private information in the data set itself,
[04:29.560 --> 04:35.120]  but researchers found that if they were able to link the Netflix records back to IMDb,
[04:35.120 --> 04:40.000]  just by using the name of the person on IMDb, their public rating dates, and public ratings,
[04:40.000 --> 04:47.820]  they could tie personally identifiable information back to Netflix. You might be wondering why this might be
[04:47.820 --> 04:57.120]  damaging, since people choose to put their information on IMDb publicly. This could be because they
[04:57.120 --> 05:04.100]  might be selective, choosing only a subset of their ratings to disclose on IMDb but not the rest of their Netflix ratings.
[05:04.100 --> 05:09.540]  For example, for a particular person, they were able to infer,
[05:09.540 --> 05:15.960]  just based on his Netflix record, that he might have a political preference, from his ratings of
[05:15.960 --> 05:21.020]  Power and Terror: Noam Chomsky in Our Times and Fahrenheit 9/11. They might be able to infer his religious
[05:21.020 --> 05:26.080]  views based on his Jesus of Nazareth and The Gospel of John ratings, and also his eating habits based
[05:26.080 --> 05:33.320]  on his rating of Super Size Me. In the next slide, we look at a new type of attack called
[05:33.320 --> 05:38.140]  the reconstruction attack. Models that tend to keep the training information
[05:38.140 --> 05:45.560]  in the structure itself, such as nearest-neighbor classifiers and kernel-based SVMs, are more prone to this. But
[05:45.560 --> 05:51.760]  neural networks are prone to it as well, as we can see in the next slide, where in production environments,
[05:51.760 --> 05:57.820]  researchers and even implementers tend to reuse neural network models because of something
[05:57.820 --> 06:05.340]  called transfer learning, where they're able to take one model and
[06:07.300 --> 06:11.900]  create a new prediction from that same model, and an attacker can use its outputs and architecture
[06:11.900 --> 06:17.860]  to reconstruct back to the original input. In the next slide, we also see a similar type of
[06:17.860 --> 06:23.700]  attack called the model inversion attack, where in this case, a feature such as an image is fed into
[06:23.860 --> 06:28.820]  a neural network model, which produces a set of probabilities, and then the researcher can use
[06:28.820 --> 06:34.120]  these probabilities and the name to reconstruct an image close to the real thing. For example,
[06:34.120 --> 06:39.380]  on the right side of the slide, you can see that the left image is the reconstructed image,
[06:39.380 --> 06:43.660]  and the right one is the original image, and they're very similar to each other.
[06:44.180 --> 06:48.860]  In the next slide, we also have a membership inference attack, where an adversary can
[06:48.860 --> 06:53.920]  generate multiple shadow network models that resemble the production environment,
[06:53.920 --> 06:59.000]  produce the probabilities from these models, create a new model called the attack model to
[06:59.000 --> 07:05.400]  feed in these probabilities, and learn which observations the production model was trained on
[07:05.400 --> 07:09.540]  and which it was not, identifying which are the members and which are not the
[07:09.540 --> 07:18.380]  members. In the next slide, we also have a core attack. Models such as LSTMs,
[07:18.380 --> 07:28.480]  or models that learn from sequences of data points, are more prone to this. For this example, you might have a
[07:28.480 --> 07:32.020]  model that learns from text containing social security numbers, and it does an autocomplete, and one of the autocompletes
[07:32.020 --> 07:38.100]  could be some sort of sensitive information, such as the phone number or the social security number of a
[07:38.100 --> 07:46.340]  person. This is more likely if the neural network model is over-memorizing,
[07:46.340 --> 07:52.980]  for example where a record appears at least five times in the training data. In the next slide, we have a summary
[07:52.980 --> 07:58.840]  of different privacy attacks that I've talked about earlier on. You could take a pause here
[07:58.840 --> 08:04.360]  and just look through the difference, compare the different approaches, and look at the examples,
[08:04.360 --> 08:10.950]  but we can move on if you're ready to move on. Next slide.
[08:11.830 --> 08:18.690]  All right, so now we're really talking about methods and techniques to preserve privacy.
[08:19.650 --> 08:27.890]  To preserve privacy, we have four main points to protect. The training data and input and output
[08:27.890 --> 08:33.910]  of the model shouldn't be visible to anyone who is not authorized to see them. And finally,
[08:33.910 --> 08:39.930]  the model privacy, which makes sure that the model cannot be stolen by any malicious party.
[08:40.170 --> 08:47.110]  Two main categories to preserve privacy are secure computation and privacy preservation. In case of
[08:47.110 --> 08:54.090]  secure computation, we transform or distribute our computations in a way that is not readable
[08:54.090 --> 09:00.330]  by any unauthorized person. And in terms of privacy preservation techniques, we use
[09:00.330 --> 09:04.950]  noise injection, or masking and hiding techniques, in order to preserve privacy.
[09:05.790 --> 09:11.910]  The first method of privacy preservation is homomorphic encryption. Homomorphic encryption
[09:11.910 --> 09:18.610]  is a type of encryption where a third party can do
[09:18.610 --> 09:24.130]  your computations for you, not on the original form of the data, but on an encrypted form
[09:24.130 --> 09:35.290]  of the data. This means that if you use a third-party company for your ML computations, you can
[09:35.290 --> 09:41.010]  give your data to them in a transformed form, encrypted using homomorphic encryption, and they
[09:41.010 --> 09:45.790]  can do the computations for you and give you back the results. And then you can decrypt the results
[09:45.790 --> 09:52.470]  and get the plaintext. The second method is secure multi-party
[09:52.470 --> 09:59.730]  computation. With this technique, we have multiple parties who all contribute to building
[09:59.730 --> 10:07.550]  an ML model or computing the value of a function. But the thing is that the input
[10:07.550 --> 10:15.150]  is distributed between all of these parties. So as you see, part of input A1 is here, A2 is here,
[10:15.150 --> 10:23.690]  A3 is here, and so on. So this way, if an adversary wants to attack the outputs, they need to
[10:23.690 --> 10:29.850]  be able to hack into all of the parties in order to make sense of the data. Otherwise, getting
[10:29.850 --> 10:37.110]  information from one of the parties is meaningless. Another technique is federated machine learning.
[10:37.110 --> 10:43.470]  Federated machine learning is very similar to the previous method. However, the difference is that
[10:43.470 --> 10:52.030]  we have multiple sets of users, and these users all contribute to building a classifier.
[10:52.750 --> 10:59.250]  The way that we handle the creation of the output is that each set of users trains locally, and we assign some sort of weights
[10:59.250 --> 11:05.950]  to each of the locally trained models and update the classifier as their weighted average.
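The weighted averaging step described here can be sketched in a few lines of Python. This is an illustration under assumptions, not the speakers' implementation: the function name `federated_average` is hypothetical, and weighting each client by its number of local training examples is one common choice (as in federated averaging), which the talk does not specify.

```python
def federated_average(client_params, client_sizes):
    """Weighted average of locally trained parameter vectors.

    client_params: list of parameter lists, one per client (same length each).
    client_sizes:  number of local training examples per client, used as the
                   weight for that client's model (an assumed weighting).
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients: the second has three times as much local data,
# so its parameters count three times as much in the global model.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)
```

Only the parameter vectors leave each client in this sketch; the raw training data stays local, which is the point of the technique.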
[11:07.830 --> 11:12.550]  The most important technique to preserve privacy is differential privacy.
[11:12.930 --> 11:19.050]  Basically, differential privacy is a noise injection technique that ensures
[11:19.050 --> 11:25.550]  the output of your model doesn't reveal information about the input of the model.
[11:25.550 --> 11:31.690]  Whether your model is trained using the data of this person or the data of this person
[11:31.690 --> 11:35.110]  shouldn't be revealed in the outcome of your algorithm.
[11:37.230 --> 11:44.270]  It can be non-interactive or interactive. For example, in case of non-interactive,
[11:44.270 --> 11:50.710]  we pre-compute the statistics, such as the relationships between the input
[11:51.350 --> 11:57.970]  features and the amount of noise needed, and then inject noise into that. In the interactive case, the noise is
[11:59.270 --> 12:03.770]  injected based on the user's requests.
[12:05.350 --> 12:11.910]  What does differential privacy mean? It means that if person i changes their information
[12:12.510 --> 12:21.570]  from x_i to x'_i, then the probability of any output should not change by much.
[12:21.570 --> 12:30.010]  What does that mean again? It means that if one of my features changes,
[12:30.010 --> 12:36.990]  and I am a person who is contributing to training a model, that change shouldn't cause a huge change
[12:36.990 --> 12:42.510]  in the output. Because if it causes a huge change, it will reveal some information
[12:42.510 --> 12:50.770]  about this instance or observation in the input. For example, if I change the color of my hair
[12:50.770 --> 12:57.810]  from brown to green, and that has a huge effect on the output of the model, then just by observing
[12:57.810 --> 13:04.610]  that change, an adversary will know that I changed the color of my hair. So using differential
[13:04.610 --> 13:12.730]  privacy, we want to make that change as minimal as possible. And how do we do that? By computing
[13:12.730 --> 13:19.050]  the effect of that change and injecting noise scaled to it into our model.
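One standard way to turn "compute the effect of the change, then add noise scaled to it" into code is the Laplace mechanism, sketched below in Python. This is a swapped-in textbook technique for illustration, not something shown in the talk: the helper names `laplace_noise` and `private_count` and the hair-color data are assumptions. For a counting query, one person changing their record moves the count by at most 1 (the sensitivity), and adding Laplace noise with scale sensitivity/epsilon gives the usual guarantee Pr[M(D) in S] <= e^epsilon * Pr[M(D') in S] for neighboring datasets D and D'.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    # Each exponential has mean `scale` (rate 1/scale); their difference
    # follows a Laplace distribution centered at zero with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Noisy count of records matching `predicate` (Laplace mechanism)."""
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0  # one person changing their record shifts a count by <= 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def is_brown(color):
    return color == "brown"


rng = random.Random(0)  # seeded only so the sketch is repeatable
hair = ["brown", "brown", "black", "green", "brown"]
neighbor = ["green", "brown", "black", "green", "brown"]  # one person changed

# The true counts are 3 and 2, but each released value is blurred enough that
# an observer cannot confidently tell which dataset produced it.
print(private_count(hair, is_brown, epsilon=0.5, rng=rng))
print(private_count(neighbor, is_brown, epsilon=0.5, rng=rng))
```

A smaller epsilon means a larger noise scale, which is the privacy and utility trade-off described earlier in the talk.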
[13:19.870 --> 13:27.370]  So looking again at differential privacy, you see here that we have two datasets. We call these
[13:27.370 --> 13:34.690]  neighboring datasets, meaning that all of the entries in both datasets are equal,
[13:34.690 --> 13:42.210]  except for one. Here, it is x3 and x3 prime. We want to make sure that when we change
[13:42.210 --> 13:50.790]  the input data from x3 to x3 prime, the output that is generated by the model changes
[13:50.790 --> 13:56.290]  minimally. So as you see, the ratio of the probability of generating a specific output in the first case
[13:56.290 --> 14:03.930]  to that in the second case is bounded by a small factor set by epsilon. The next method is called PATE, private
[14:03.930 --> 14:12.090]  aggregation of teacher ensembles. In this case, we have several classifiers that each use some of the data
[14:12.090 --> 14:19.470]  and some features in order to train a model. Then we have two sets of voters. It's like an
[14:19.470 --> 14:25.270]  N-version programming type of technique. These voters choose the majority vote generated by
[14:25.270 --> 14:33.130]  all of these classifiers. And if we have an equal number of votes, for example, in this case,
[14:33.130 --> 14:40.730]  we have two cancer and two healthy results, then we will calculate Gaussian noise to
[14:40.730 --> 14:47.270]  inject into the vote data. So basically, we again go back to our training data,
[14:47.270 --> 14:55.530]  and we take a look and see how much noise needs to be injected for each classifier.
[14:55.530 --> 15:02.770]  Then we inject that noise and we get the result from this chart based on the noisy
[15:02.770 --> 15:08.330]  votes. So based on the amount of noise injected, the model might generate the value cancer or
[15:08.330 --> 15:15.430]  might generate the value healthy. The important thing is that whatever the output is, for example,
[15:15.430 --> 15:21.610]  in this case, it is cancer, the adversary cannot correlate it back to the data that was used in training
[15:21.610 --> 15:28.610]  in order to get the value of cancer here. Because even where we have the value healthy
[15:28.610 --> 15:33.670]  here, with the noise injection these might also generate the value cancer. So we don't really
[15:33.670 --> 15:40.070]  know whether James Smith is the cause of generating the result cancer or whether someone similar to
[15:40.070 --> 15:46.670]  James Smith in this training data set is contributing to that result. Here is the
[15:46.670 --> 15:52.290]  comparison of the defense methods in terms of what they emphasize, what they are trying to
[15:52.290 --> 15:59.170]  protect, whether it is the data owner or the model, and their use cases in different
[15:59.170 --> 16:05.390]  applications. Most importantly, most of these methods are really effective when they're used
[16:05.390 --> 16:10.570]  in combination with each other, one from each category. For example, homomorphic encryption
[16:10.570 --> 16:16.790]  and differential privacy, or SMPC and differential privacy, or PATE and federated learning plus
[16:16.790 --> 16:24.870]  homomorphic encryption. Here are the practical methods for privacy preservation. They are very
[16:24.870 --> 16:32.530]  easy to use. There are already packages built for that, for TensorFlow or PyTorch,
[16:32.530 --> 16:39.670]  or the IBM privacy package. We are going to show an example of how to use one of these packages on a
[16:39.670 --> 16:48.270]  real-world example. In this demo, we're going to show how you would implement a solution
[16:48.270 --> 16:52.610]  from an adversary point of view, and then from a defense point of view, how you can easily apply
[16:52.610 --> 16:59.350]  a differentially private method to your model. So here in the references, you'll see the data
[16:59.350 --> 17:05.050]  that we use, which is the purchase dataset from Kaggle. This contains information
[17:05.670 --> 17:11.330]  that you can use to create labels, such as for a binary classifier. In some cases, it could be
[17:11.330 --> 17:16.430]  whether it's fraudulent or not, or whether it's abusive or not. Then we can also create a set
[17:16.430 --> 17:25.730]  of multi-class labels, such as the class for a customer, or it could be the category of a
[17:25.730 --> 17:32.690]  merchant, or any other multi-class that you can think of relevant to your use case. And you can
[17:32.690 --> 17:37.850]  see that we're also going to show you the TensorFlow privacy method as well. So first, we're going to
[17:37.850 --> 17:44.150]  look at how a shadow network is created. It's really easy. First, you create the set of shadow
[17:44.150 --> 17:50.450]  networks. Then you output the predictions from the shadow networks, feed those into an attack network,
[17:50.450 --> 17:55.910]  and then train a target network that is similar to what you think the production environment would
[17:55.910 --> 18:02.010]  look like, and then see whether the prediction from that target network is predicted as an in-member
[18:02.010 --> 18:07.850]  or out-member by the attack network. If we look at the shadow network, all you have to do is create
[18:08.130 --> 18:14.010]  a very simple for loop. So if you go down, you can see that all you have to do is create
[18:14.150 --> 18:19.870]  n number of shadow networks. They could be similar to each other, or they could be of different
[18:19.870 --> 18:26.690]  structure. But for our use case, we're going to create one type of TensorFlow model, a very simple
[18:26.690 --> 18:35.070]  one with an initial dense layer of 64 nodes, and then dropout, and then another dense layer of 24 nodes,
[18:35.070 --> 18:41.350]  and then finally the softmax for a prediction of the multi-class or binary class. Each of these
[18:41.350 --> 18:48.410]  n shadow networks will have the same architecture, and it'll produce a prediction of how likely it is
[18:48.410 --> 18:54.370]  to be in a certain class or not. If we look at the attack network, this is where you're feeding in
[18:54.370 --> 18:59.930]  the predictions of the shadow networks. We're then training a very fine model to distinguish
[18:59.930 --> 19:05.610]  the threshold of whether a particular observation was in a network's training data or not. So this is very
[19:05.610 --> 19:11.910]  simple. You're taking samples from the shadow network. You're going to identify whether each came from
[19:11.910 --> 19:18.810]  that particular shadow network's training data or not, and that would be pretty much your label. If we go to the
[19:18.810 --> 19:26.370]  target DP, this will reveal how easy it is to implement a differentially private method for your
[19:26.370 --> 19:31.510]  defense mechanism if you already have a deep learning TensorFlow model in production. So all
[19:31.510 --> 19:36.710]  you really have to do is import the TensorFlow privacy library and then update your optimizer.
[19:36.710 --> 19:41.610]  So in cell 11, you can see that the model is the same, but in cell 12, you can see that
[19:41.610 --> 19:46.450]  the only thing that changes is the optimizer, which is a DP gradient descent Gaussian optimizer
[19:46.450 --> 19:51.810]  with four inputs that you would have to fine-tune: l2_norm_clip, noise_multiplier,
[19:51.810 --> 19:56.410]  num_microbatches, and learning_rate. And then you have your differentially private method.
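To show what those four inputs control, here is a library-free Python sketch of a single differentially private gradient descent step. It is an illustration of the general DP-SGD idea (clip each example's gradient, sum, add Gaussian noise, average, step), not the TensorFlow Privacy source and not the notebook from the demo. The function names `clip_by_l2_norm` and `dp_sgd_step` are hypothetical, and the sketch assumes one example per microbatch, so the number of microbatches equals the batch size.

```python
import math
import random


def clip_by_l2_norm(grad, l2_norm_clip):
    """Scale a gradient vector down so its L2 norm is at most l2_norm_clip."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, l2_norm_clip / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, l2_norm_clip, noise_multiplier,
                learning_rate, rng):
    """One noisy gradient step (assumes one example per microbatch)."""
    # 1. Clip each example's gradient so no single record dominates the update.
    clipped = [clip_by_l2_norm(g, l2_norm_clip) for g in per_example_grads]
    # 2. Sum the clipped gradients coordinate by coordinate.
    summed = [sum(column) for column in zip(*clipped)]
    # 3. Add Gaussian noise with standard deviation noise_multiplier * clip.
    stddev = noise_multiplier * l2_norm_clip
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    # 4. Average over the batch and take an ordinary gradient descent step.
    batch_size = len(per_example_grads)
    return [p - learning_rate * g / batch_size for p, g in zip(params, noisy)]


rng = random.Random(0)  # seeded only so the sketch is repeatable
grads = [[3.0, 4.0], [0.6, 0.8]]  # made-up per-example gradients
updated = dp_sgd_step([0.0, 0.0], grads, l2_norm_clip=1.0,
                      noise_multiplier=1.1, learning_rate=0.5, rng=rng)
print(updated)
```

Raising `noise_multiplier` or lowering `l2_norm_clip` strengthens the privacy protection and usually costs accuracy, which is why the talk describes these values as something to tune.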
[19:59.170 --> 20:06.630]  And then if we go back to the first one, here we show results of the different number of shadow
[20:06.630 --> 20:11.550]  networks with the different number of classes, 10 classes, 100 classes, and the binary two
[20:11.550 --> 20:15.630]  classes, and whether it was differentially private or not. And you can see the different types of
[20:15.630 --> 20:24.230]  accuracy, test accuracy, and then the recall and precision rate as well. This is just a summary.
[20:24.230 --> 20:28.350]  If you're interested, you can pause here and look at it for further details.
[20:35.590 --> 20:40.190]  Okay, so this is the end of our presentation. Thank you so much for your time.
[20:40.190 --> 20:45.350]  If you have questions, please reach out to our emails included in the slide deck.
